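OpenClaw Enterprise Deployment: A Complete Plan
Published: 2026-03-18
Category: Enterprise solutions
Tags: OpenClaw, enterprise, deployment, high availability, security, operations
Scenario: enterprise deployment for a 100-person team
Requirements
- User base: 100 users, about 60 active per day
- Concurrency: 20 concurrent sessions at peak
- Availability: 99.9% (about 43 minutes of downtime per 30-day month)
- Security: data encryption, access control, audit logging
- Scalability: room to grow to 500 users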
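The availability target translates into a concrete downtime budget. A minimal Python sketch of that arithmetic follows; the function name `downtime_budget_minutes` and the 30-day month are assumptions made for illustration.

```python
def downtime_budget_minutes(availability: float, days: int = 30) -> float:
    """Return the downtime allowed over `days` days, in minutes.

    `availability` is a fraction, e.g. 0.999 for "three nines".
    """
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    total_minutes = days * 24 * 60
    return total_minutes * (1.0 - availability)
```

For a 30-day month, `downtime_budget_minutes(0.999)` comes to roughly 43.2 minutes, which is where the figure in the requirements comes from.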
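Challenges
- Single points of failure
- Performance bottlenecks
- Data security
- Operational complexity
- Cost control
Architecture design
Overall architecture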
```text
               ┌─────────────────┐
               │  Load balancer  │
               │   (Nginx HA)    │
               └────────┬────────┘
                        │
         ┌──────────────┼──────────────┐
         │              │              │
┌────────▼────────┐     │     ┌────────▼────────┐
│ OpenClaw node 1 │     │     │ OpenClaw node 2 │
│    (primary)    │     │     │    (standby)    │
└────────┬────────┘     │     └────────┬────────┘
         │              │              │
         └──────────────┼──────────────┘
                        │
              ┌─────────▼─────────┐
              │   Redis cluster   │
              │  (session store)  │
              └─────────┬─────────┘
                        │
              ┌─────────▼─────────┐
              │    PostgreSQL     │
              │ (persistent data) │
              └─────────┬─────────┘
                        │
              ┌─────────▼─────────┐
              │  Object storage   │
              │   (files/logs)    │
              └───────────────────┘
```
Technology selection
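| Component | Choice | Notes |
|---|---|---|
| Load balancing | Nginx + Keepalived | High availability |
| Application servers | OpenClaw × 2 | Primary/standby |
| Session store | Redis Cluster | Distributed cache |
| Database | PostgreSQL 15 | Persistent data |
| Object storage | MinIO | Self-hosted, S3-compatible |
| Monitoring | Prometheus + Grafana | Metrics |
| Logging | ELK Stack | Log analysis |
| Deployment | Docker + K8s | Containerized |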
Deployment steps
Step 1: Server planning
| Server | Spec (vCPU / GB RAM) | Role | Count |
|---|---|---|---|
| app-server | 4C8G | OpenClaw application | 2 |
| db-server | 8C16G | PostgreSQL | 2 (primary/replica) |
| cache-server | 4C8G | Redis | 3 (cluster) |
| storage-server | 8C32G | MinIO | 3 (distributed) |
| lb-server | 2C4G | Nginx | 2 (active/standby) |
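Summing the table gives the total footprint to provision. The sketch below is illustrative only: the `ServerGroup` type and the `totals` helper are hypothetical names, and the numbers are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerGroup:
    name: str
    cores: int       # vCPUs per host
    memory_gb: int   # RAM per host, in GB
    count: int       # number of hosts in the group


# Values copied from the server plan table.
PLAN = [
    ServerGroup("app-server", 4, 8, 2),
    ServerGroup("db-server", 8, 16, 2),
    ServerGroup("cache-server", 4, 8, 3),
    ServerGroup("storage-server", 8, 32, 3),
    ServerGroup("lb-server", 2, 4, 2),
]


def totals(plan):
    """Return (hosts, cores, memory_gb) summed over all groups."""
    hosts = sum(g.count for g in plan)
    cores = sum(g.cores * g.count for g in plan)
    memory_gb = sum(g.memory_gb * g.count for g in plan)
    return hosts, cores, memory_gb
```

With the plan as written, `totals(PLAN)` returns 12 hosts, 64 vCPUs and 176 GB of RAM.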
Recommended cloud providers:
- Alibaba Cloud: economical
- Tencent Cloud: good price-performance
- AWS: global deployment
Step 2: Containerize with Docker
```dockerfile
# Dockerfile
FROM node:22-alpine
WORKDIR /app

# Install OpenClaw
RUN npm install -g openclaw

# Copy configuration
COPY openclaw.json /root/.openclaw/
COPY workspace/ /root/.openclaw/workspace/

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD openclaw gateway status || exit 1

# Start
CMD ["openclaw", "gateway", "start", "--foreground"]
```
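The `HEALTHCHECK` parameters bound how long a hung gateway can go unnoticed. As a rough sketch, assuming Docker's documented behaviour (each check starts `interval` after the previous one finishes, a check that exceeds `timeout` counts as failed, and the container becomes unhealthy after `retries` consecutive failures), the worst-case detection delay can be estimated as follows. The function name is hypothetical.

```python
def worst_case_unhealthy_delay(interval_s: float, timeout_s: float, retries: int) -> float:
    """Upper bound, in seconds, between a failure and the 'unhealthy' state.

    Worst case: the failure happens right after a successful check, and every
    following check runs for the full timeout before it is counted as failed.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    return retries * (interval_s + timeout_s)
```

With the values in the Dockerfile (30 s interval, 10 s timeout, 3 retries) the bound is 120 seconds, a small slice of the monthly downtime budget.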
```yaml
# docker-compose.yml
version: '3.8'

services:
  openclaw-primary:
    build: .
    container_name: openclaw-primary
    environment:
      - NODE_ENV=production
      - OPENCLAW_CONFIG=/root/.openclaw/openclaw.json
    volumes:
      - ./config:/root/.openclaw
      - ./workspace:/root/.openclaw/workspace
      - ./logs:/root/.openclaw/logs
    networks:
      - openclaw-net
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
    healthcheck:
      test: ["CMD", "openclaw", "gateway", "status"]
      interval: 30s
      timeout: 10s
      retries: 3

  openclaw-replica:
    build: .
    container_name: openclaw-replica
    environment:
      - NODE_ENV=production
    volumes:
      - ./config:/root/.openclaw
      - ./workspace:/root/.openclaw/workspace
      - ./logs:/root/.openclaw/logs
    networks:
      - openclaw-net
    depends_on:
      - openclaw-primary
    deploy:
      replicas: 1

  redis:
    image: redis:7-alpine
    command: redis-server --appendonly yes
    volumes:
      - redis-data:/data
    networks:
      - openclaw-net

  postgres:
    image: postgres:15-alpine
    environment:
      - POSTGRES_DB=openclaw
      - POSTGRES_USER=openclaw
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    volumes:
      - postgres-data:/var/lib/postgresql/data
    networks:
      - openclaw-net

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
      - ./ssl:/etc/nginx/ssl
    networks:
      - openclaw-net
    depends_on:
      - openclaw-primary
      - openclaw-replica

volumes:
  redis-data:
  postgres-data:

networks:
  openclaw-net:
    driver: bridge
```
Step 3: Nginx load-balancing configuration
```nginx
# nginx.conf
# Mounted as /etc/nginx/nginx.conf, so the top-level events and http blocks are required.
events {}

http {
    # Send "Connection: upgrade" only for WebSocket handshakes so that
    # ordinary requests can reuse the upstream keepalive connections.
    map $http_upgrade $connection_upgrade {
        default upgrade;
        ''      '';
    }

    upstream openclaw_backend {
        least_conn;
        # Both nodes receive traffic; add "backup" to the replica line for strict primary/standby.
        server openclaw-primary:8080 weight=3;
        server openclaw-replica:8080 weight=2;
        keepalive 32;
    }

    server {
        listen 80;
        server_name openclaw.company.com;
        return 301 https://$server_name$request_uri;
    }

    server {
        listen 443 ssl http2;
        server_name openclaw.company.com;

        ssl_certificate /etc/nginx/ssl/fullchain.pem;
        ssl_certificate_key /etc/nginx/ssl/privkey.pem;
        ssl_protocols TLSv1.2 TLSv1.3;
        ssl_ciphers HIGH:!aNULL:!MD5;

        # Security headers
        add_header X-Frame-Options "SAMEORIGIN" always;
        add_header X-Content-Type-Options "nosniff" always;
        add_header X-XSS-Protection "1; mode=block" always;
        add_header Strict-Transport-Security "max-age=31536000" always;

        location / {
            proxy_pass http://openclaw_backend;
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection $connection_upgrade;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            # Timeouts
            proxy_connect_timeout 60s;
            proxy_send_timeout 60s;
            proxy_read_timeout 60s;

            # Buffering
            proxy_buffering off;
        }

        # Health check endpoint (answered by Nginx itself, not by the backends)
        location /health {
            access_log off;
            default_type text/plain;
            return 200 "healthy\n";
        }
    }
}
```
Step 4: High availability configuration
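Keepalived configuration
```conf
# /etc/keepalived/keepalived.conf (master node)
vrrp_script check_nginx {
    script "killall -0 nginx"
    interval 2
    # Negative weight: lower the priority by 20 while Nginx is down, which drops
    # the master (100) below the backup (90) so that the virtual IP moves.
    weight -20
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        # Placeholder; replace with your own secret
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.1.100
    }
    track_script {
        check_nginx
    }
}
```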
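Whether the virtual IP actually moves when Nginx dies depends on how the `vrrp_script` weight combines with the two node priorities. The sketch below models that rule under stated assumptions: a master at priority 100, a standby at priority 90, both tracking the same script, and Keepalived's documented weight semantics (a positive weight is added while the script succeeds, a negative one is subtracted while it fails). The function names are hypothetical, and this is a simplification, not Keepalived's implementation.

```python
def effective_priority(base: int, weight: int, script_ok: bool) -> int:
    """Priority of one node after applying a single tracked script.

    Positive weight: added while the script succeeds.
    Negative weight: subtracted (by its magnitude) while the script fails.
    """
    if weight == 0:
        raise ValueError("weight 0 is not modelled here (a failure then means FAULT state)")
    if weight > 0:
        return base + weight if script_ok else base
    return base if script_ok else base + weight


def vip_holder_after_nginx_failure(master_base: int, backup_base: int, weight: int) -> str:
    """Who holds the virtual IP once Nginx is down on the master but fine on the backup."""
    master = effective_priority(master_base, weight, script_ok=False)
    backup = effective_priority(backup_base, weight, script_ok=True)
    # Ties are not modelled; the current master is assumed to keep the address.
    return "backup" if backup > master else "master"
```

With priorities 100 and 90, a weight of 2 leaves the master in place (100 against 92), while a weight of -20 hands the address to the backup (80 against 90). In other words, the weight has to be negative and larger in magnitude than the 10-point gap for failover to happen.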
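```conf
# /etc/keepalived/keepalived.conf (backup node)
vrrp_script check_nginx {
    script "killall -0 nginx"
    interval 2
    weight -20
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        # Must match the master
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.1.100
    }
    track_script {
        check_nginx
    }
}
```
Step 5: Data backup strategy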
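```bash
#!/bin/bash
# backup.sh - automated backup script
set -euo pipefail

BACKUP_DIR="/backups/openclaw"
DATE=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=30
mkdir -p "$BACKUP_DIR"

# Back up the database (expects the password in ~/.pgpass or PGPASSWORD)
pg_dump -h localhost -U openclaw openclaw | gzip > "$BACKUP_DIR/db_$DATE.sql.gz"

# Back up configuration files
tar -czf "$BACKUP_DIR/config_$DATE.tar.gz" \
  /root/.openclaw/openclaw.json \
  /root/.openclaw/workspace/

# Back up credentials (encrypted). gpg --symmetric takes a single input,
# so the JSON files are bundled with tar first.
tar -czf - /root/.openclaw/credentials/*.json | gpg --symmetric --cipher-algo AES256 \
  --batch --pinentry-mode loopback --passphrase "$BACKUP_PASSPHRASE" \
  --output "$BACKUP_DIR/credentials_$DATE.tar.gz.gpg"

# Upload only this run's files to object storage
aws s3 cp "$BACKUP_DIR" "s3://openclaw-backups/$DATE/" \
  --recursive --exclude "*" --include "*_$DATE.*" \
  --endpoint-url http://minio:9000

# Remove old local backups, including the encrypted credential archives
find "$BACKUP_DIR" \( -name "*.gz" -o -name "*.gpg" \) -mtime +"$RETENTION_DAYS" -delete

echo "✅ Backup complete: $DATE"
```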
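The cleanup step keeps 30 days of local backups. The same rule can be written as a small, testable function. This sketch reads the timestamp embedded in each filename (for example `db_20260318_020000.sql.gz`), whereas `find -mtime` in the script looks at the file's modification time, so treat it as an approximation of the script's behaviour. `expired_backups` is a hypothetical helper.

```python
import re
from datetime import datetime, timedelta

# Matches the "_YYYYmmdd_HHMMSS." stamp that backup.sh puts into each filename.
_STAMP = re.compile(r"_(\d{8}_\d{6})\.")


def expired_backups(filenames, now, retention_days=30):
    """Return the names whose embedded timestamp is older than the retention window."""
    cutoff = now - timedelta(days=retention_days)
    expired = []
    for name in filenames:
        match = _STAMP.search(name)
        if match is None:
            continue  # no timestamp in the name: not a backup file, leave it alone
        taken = datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")
        if taken < cutoff:
            expired.append(name)
    return expired
```

Keeping the rule in a pure function makes it easy to check edge cases, such as files without a timestamp, before anything is deleted.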
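A scheduled job runs the script every day at 02:00 (Asia/Shanghai):
```json
{
  "name": "Daily backup",
  "schedule": {
    "kind": "cron",
    "expr": "0 2 * * *",
    "tz": "Asia/Shanghai"
  },
  "payload": {
    "kind": "systemEvent",
    "text": "Run /opt/openclaw/backup.sh"
  }
}
```
Monitoring and alerting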
Prometheus configuration
```yaml
# prometheus.yml
global:
  scrape_interval: 15s

rule_files:
  - alert_rules.yml

scrape_configs:
  - job_name: 'openclaw'
    metrics_path: '/metrics'
    static_configs:
      - targets: ['openclaw-primary:8080', 'openclaw-replica:8080']

  - job_name: 'nginx'
    static_configs:
      - targets: ['nginx:9113']     # nginx-prometheus-exporter

  - job_name: 'redis'
    static_configs:
      - targets: ['redis:9121']     # redis_exporter; Prometheus cannot scrape port 6379 directly

  - job_name: 'postgres'
    static_configs:
      - targets: ['postgres:9187']  # postgres_exporter
```
Grafana dashboard
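Key metrics:
Application layer
- Request rate (QPS)
- Response time (P95, P99)
- Error rate
- Active sessions
System layer
- CPU usage
- Memory usage
- Disk usage
- Network traffic
Business layer
- Daily active users
- API call volume
- Token consumption
- Cost tracking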
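To make the application-layer metrics concrete, here is a minimal sketch that computes an error rate and a nearest-rank percentile from raw samples. It is illustrative only: the function names are hypothetical, and Prometheus's `histogram_quantile` estimates percentiles from histogram buckets, so its results will differ slightly from an exact calculation like this one.

```python
import math


def error_rate(status_codes):
    """Fraction of responses with a 5xx status; 0.0 for an empty sample."""
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if 500 <= code <= 599)
    return errors / len(status_codes)


def percentile(values, q):
    """Nearest-rank percentile: the smallest sample with at least q% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < q <= 100:
        raise ValueError("q must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(q * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]
```

For example, `error_rate([200, 200, 500, 404])` is 0.25, because only 5xx responses count as errors, and `percentile(latencies, 95)` gives the P95 of a list of response times.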
Alert rules
```yaml
# alert_rules.yml
groups:
  - name: openclaw_alerts
    rules:
      - alert: HighErrorRate
        # Ratio of 5xx responses to all responses over 5 minutes
        expr: sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.05
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High error rate"
          description: "Error rate above 5%"

      - alert: HighResponseTime
        expr: histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) > 2
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Slow responses"
          description: "P95 response time above 2 seconds"

      - alert: HighMemoryUsage
        expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes > 0.9
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage"
          description: "Memory usage above 90%"

      - alert: InstanceDown
        expr: up{job="openclaw"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance down"
          description: "{{ $labels.instance }} is down"
```
Security hardening
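1. Network security
```bash
# Firewall configuration
ufw default deny incoming
ufw default allow outgoing
ufw allow 22/tcp    # SSH
ufw allow 80/tcp    # HTTP (Nginx redirects it to HTTPS)
ufw allow 443/tcp   # HTTPS
ufw allow from 10.0.0.0/8 to any port 5432   # database, internal network only
ufw allow from 10.0.0.0/8 to any port 6379   # Redis, internal network only
ufw enable
```
2. Access control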
```json
{
  "security": {
    "authRequired": true,
    "allowedIPs": ["10.0.0.0/8", "192.168.0.0/16"],
    "rateLimit": {
      "requests": 100,
      "windowMs": 60000
    },
    "sessionTimeout": 3600000
  }
}
```
3. Audit logging
```javascript
// Audit log configuration
{
  "logging": {
    "level": "info",
    "audit": {
      "enabled": true,
      "events": [
        "login",
        "logout",
        "config_change",
        "credential_access",
        "data_export"
      ],
      "retention": 90
    }
  }
}
```
Cost estimate
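Monthly cost (100-person team)
| Item | Spec | Cost |
|---|---|---|
| Application servers | 4C8G × 2 | 800 yuan |
| Database | 8C16G × 2 | 1,600 yuan |
| Cache | 4C8G × 3 | 1,200 yuan |
| Storage | 8C32G × 3 | 2,400 yuan |
| Load balancers | 2C4G × 2 | 400 yuan |
| LLM API | 100,000 calls | 3,000 yuan |
| Bandwidth | 100 Mbps | 500 yuan |
| Total | | 9,900 yuan/month |
Cost per user
- 9,900 yuan ÷ 100 users = 99 yuan per user per month
ROI analysis
Assuming each person saves 1 hour per day at an hourly rate of 50 yuan:
- Monthly savings: 100 people × 22 working days × 1 hour × 50 yuan = 110,000 yuan
- Monthly cost: 9,900 yuan
- ROI: (110,000 - 9,900) ÷ 9,900 ≈ 1,011%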
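The cost and ROI figures can be reproduced with a few lines of arithmetic. This is a sketch using the numbers from the table and the stated assumptions (22 working days, 1 hour saved per person per day, 50 yuan per hour); the names are hypothetical.

```python
# Monthly costs in yuan, copied from the cost table.
MONTHLY_COSTS_YUAN = {
    "application servers": 800,
    "database": 1600,
    "cache": 1200,
    "storage": 2400,
    "load balancers": 400,
    "LLM API": 3000,
    "bandwidth": 500,
}


def monthly_total(costs):
    """Sum of all monthly cost items."""
    return sum(costs.values())


def monthly_saving(people, working_days, hours_per_day, hourly_rate):
    """Value of the time saved per month."""
    return people * working_days * hours_per_day * hourly_rate


def roi_percent(saving, cost):
    """ROI as (saving - cost) / cost, expressed in percent."""
    return (saving - cost) / cost * 100
```

Here `monthly_total(MONTHLY_COSTS_YUAN)` is 9,900, `monthly_saving(100, 22, 1, 50)` is 110,000, and `roi_percent(110000, 9900)` is about 1,011. The result is very sensitive to the "1 hour saved per day" assumption, so it is worth rerunning with more conservative inputs.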
Operations handbook
Daily inspection
```bash
#!/bin/bash
# daily_check.sh
echo "=== OpenClaw daily check ==="

# 1. Service status
echo "1. Service status:"
docker ps | grep openclaw

# 2. Resource usage
echo "2. Resource usage:"
top -bn1 | head -10

# 3. Disk space
echo "3. Disk space:"
df -h

# 4. Error log
echo "4. Recent errors:"
tail -50 /var/log/openclaw/error.log | grep ERROR

# 5. Backup status
echo "5. Backup status:"
ls -lt /backups/openclaw/ | head -5
```
Troubleshooting
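Scenario 1: Primary node down
```bash
# These commands assume the stack runs on Kubernetes.
# 1. Check status
kubectl get pods

# 2. View logs
kubectl logs openclaw-primary

# 3. Restart the service
kubectl rollout restart deployment/openclaw

# 4. Verify recovery
curl https://openclaw.company.com/health
```
Scenario 2: Database connection failure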
```bash
# 1. Check database status
kubectl get pods | grep postgres

# 2. Check connectivity
psql -h localhost -U openclaw -d openclaw -c "SELECT 1"

# 3. View database logs
kubectl logs postgres-primary

# 4. Restart the database
kubectl rollout restart statefulset/postgres
```
Summary
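Key points for an enterprise deployment:
- ✅ High-availability architecture: load balancing + primary/standby nodes
- ✅ Data persistence: database + object storage
- ✅ Monitoring and alerting: real-time monitoring + automated alerts
- ✅ Security hardening: network isolation + access control + auditing
- ✅ Backup strategy: automated backups + off-site storage
- ✅ Operations handbook: standardized procedures
Best suited to: teams of 50 or more, production environments, business-critical workloads
Related documents:
- [Security best practices](/guide/daily-openclaw-安全实践 -20260318)
- [Performance tuning guide](/guide/daily-openclaw-性能优化 -20260318)
- [Cost optimization in practice](/guide/daily-openclaw-成本优化实战 -20260318)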